
Conversation

@Xu-Wenqing (Contributor) commented Sep 2, 2025

Purpose

vLLM LongCat-Flash-Chat model support: #23991

This PR adds tool call support for the LongCat-Flash-Chat model.

Test Plan

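For the test scripts below to work, the vLLM server has to be launched with automatic tool choice and a matching tool-call parser. A hedged sketch of the serve command (the parser name `longcat` and the model path are assumptions based on this PR, not confirmed values; check the merged vLLM docs for the exact flags):

```shell
# Sketch only: --enable-auto-tool-choice and --tool-call-parser are existing
# vLLM serve flags; the parser name "longcat" is assumed from this PR.
vllm serve meituan-longcat/LongCat-Flash-Chat \
    --enable-auto-tool-choice \
    --tool-call-parser longcat
```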
Test Script (Streaming):

from openai import OpenAI

openai_api_base = ""  # fill in the vLLM server address
openai_api_key = ""   # fill in the API key, if any

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base + "/v1",
)

class bcolors:
    HEADER = '\033[95m'
    OKBLUE = '\033[94m'
    OKCYAN = '\033[96m'
    OKGREEN = '\033[92m'
    WARNING = '\033[93m'
    FAIL = '\033[91m'
    ENDC = '\033[0m'
    BOLD = '\033[1m'
    UNDERLINE = '\033[4m'


tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_temperature",
            "description": "Get current temperature at a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": 'The location to get the temperature for, in the format "City, State, Country".',
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": 'The unit to return the temperature in. Defaults to "celsius".',
                    },
                },
                "required": ["location"],
            },
            "strict": True
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_temperature_date",
            "description": "Get temperature at a location and date.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": 'The location to get the temperature for, in the format "City, State, Country".',
                    },
                    "date": {
                        "type": "string",
                        "description": 'The date to get the temperature for, in the format "Year-Month-Day".',
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": 'The unit to return the temperature in. Defaults to "celsius".',
                    },
                },
                "required": ["location", "date"],
            },
        },
    },
]


tool_calls_stream = client.chat.completions.create(
    model=client.models.list().data[0].id,
    messages=[
        {
            "role": "system",
            "content": "现在的日期是: 2024-09-30",  # "The current date is: 2024-09-30"
        },
        {
            "role": "user",
            "content": "北京今天的天气如何?明天呢?",  # "What's the weather in Beijing today? And tomorrow?"
        },
    ],
    tools=tools,
    tool_choice="auto",
    stream=True,
    # extra_body={"chat_template_kwargs": {"thinking": True}},
    max_completion_tokens=8192
)

print("reasoning content(Blue) and content(Green):")
chunks = []
for chunk in tool_calls_stream:
    chunks.append(chunk)
    delta = chunk.choices[0].delta
    # reasoning_content is a vLLM extension and may be absent or None.
    reasoning_content = getattr(delta, "reasoning_content", None)
    if reasoning_content:
        print(bcolors.OKBLUE + reasoning_content, end="", flush=True)
    elif delta.content:
        print(bcolors.OKGREEN + delta.content, end="", flush=True)

print(bcolors.ENDC + "\n### end of reasoning content and content. ###\n")

arguments = []
tool_call_idx = -1
for chunk in chunks:
    if chunk.choices[0].delta.tool_calls:
        tool_call = chunk.choices[0].delta.tool_calls[0]

        if tool_call.index != tool_call_idx:
            if tool_call_idx >= 0:
                print(f"streamed tool call arguments: {arguments[tool_call_idx]}")
            tool_call_idx = chunk.choices[0].delta.tool_calls[0].index
            arguments.append("")
        if tool_call.id:
            print(f"streamed tool call id: {tool_call.id} ")

        if tool_call.function:
            if tool_call.function.name:
                print(f"streamed tool call name: {tool_call.function.name}")

            if tool_call.function.arguments:
                arguments[tool_call_idx] += tool_call.function.arguments

if arguments:
    print(f"streamed tool call arguments: {arguments[-1]}")

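The argument fragments accumulated by the loop above are plain JSON text; once a call is complete they can be parsed and dispatched to a local implementation. A minimal sketch of that step (the handler functions and `TOOL_REGISTRY` table are hypothetical stand-ins, not part of this PR):

```python
import json

# Hypothetical local implementations of the two tools defined above;
# the returned values are made up for illustration.
def get_current_temperature(location, unit="celsius"):
    return {"location": location, "temperature": 26.1, "unit": unit}

def get_temperature_date(location, date, unit="celsius"):
    return {"location": location, "date": date, "temperature": 25.9, "unit": unit}

TOOL_REGISTRY = {
    "get_current_temperature": get_current_temperature,
    "get_temperature_date": get_temperature_date,
}

def dispatch(name, arguments_json):
    """Parse an accumulated argument string and call the matching tool."""
    args = json.loads(arguments_json)
    return TOOL_REGISTRY[name](**args)

print(dispatch("get_current_temperature", '{"location": "Beijing, China"}'))
# → {'location': 'Beijing, China', 'temperature': 26.1, 'unit': 'celsius'}
```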
Test Script (Non-Streaming):

from openai import OpenAI

openai_api_base = ""  # fill in the vLLM server address
openai_api_key = ""   # fill in the API key, if any

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base + "/v1",
)


tools = [
    {
        "type": "function",
        "function": {
            "strict": True,
            "name": "get_current_temperature",
            "description": "Get current temperature at a location.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": 'The location to get the temperature for, in the format "City, State, Country".',
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": 'The unit to return the temperature in. Defaults to "celsius".',
                    },
                },
                "required": ["location"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_temperature_date",
            "description": "Get temperature at a location and date.",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": 'The location to get the temperature for, in the format "City, State, Country".',
                    },
                    "date": {
                        "type": "string",
                        "description": 'The date to get the temperature for, in the format "Year-Month-Day".',
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": 'The unit to return the temperature in. Defaults to "celsius".',
                    },
                },
                "required": ["location", "date"],
            },
        },
    },
]


response = client.chat.completions.create(
    model=client.models.list().data[0].id,
    messages=[
        {
            "role": "system",
            "content": "现在的日期是: 2024-09-30",  # "The current date is: 2024-09-30"
        },
        {
            "role": "user",
            "content": "北京今天的天气如何?明天呢?",  # "What's the weather in Beijing today? And tomorrow?"
        },
    ],
    tools=tools,
    tool_choice="auto",
    stream=False,
)

print(response)
tool_calls = response.choices[0].message.tool_calls
for c in tool_calls:
    print(c.function.name, c.function.arguments)

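After the model returns tool calls like the ones printed above, the usual follow-up is to execute each tool locally and append the results as role="tool" messages before a second completions call. A minimal sketch of building that follow-up message list (the call id and result values are made up; no server request is made here):

```python
import json

def build_followup_messages(messages, assistant_message, tool_results):
    """Append the assistant's tool calls plus one role="tool" message per result.

    `tool_results` maps tool_call id -> Python object to send back to the model.
    """
    followup = list(messages)
    followup.append({
        "role": "assistant",
        "tool_calls": assistant_message["tool_calls"],
    })
    for call in assistant_message["tool_calls"]:
        followup.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(tool_results[call["id"]]),
        })
    return followup

messages = [{"role": "user", "content": "What's the weather in Beijing today?"}]
assistant = {
    "tool_calls": [{
        "id": "chatcmpl-tool-1",
        "type": "function",
        "function": {"name": "get_current_temperature",
                     "arguments": '{"location": "Beijing, China"}'},
    }]
}
followup = build_followup_messages(
    messages, assistant,
    {"chatcmpl-tool-1": {"temperature": 26.1, "unit": "celsius"}})
print(len(followup))  # → 3 (user + assistant + one tool message)
```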
Test Result

Test Result (Streaming):

reasoning content(Blue) and content(Green):

### end of reasoning content and content. ###

streamed tool call id: chatcmpl-tool-573a577d3b8c4da2bc51e7cd0575bf9e 
streamed tool call name: get_current_temperature
streamed tool call arguments: {"location": "北京, 中国"
streamed tool call id: chatcmpl-tool-4976a639656c4db29282f9b089d15f4c 
streamed tool call name: get_temperature_date
streamed tool call arguments: {"location": "北京, 中国", "date": "2024-10-01"

Test Result (Non-Streaming):

ChatCompletion(id='chatcmpl-4ac67b27-9eca-4089-9477-f51b48c866fd', choices=[Choice(finish_reason='tool_calls', index=0, logprobs=None, message=ChatCompletionMessage(content=None, refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=[ChatCompletionMessageFunctionToolCall(id='chatcmpl-tool-b8069bd55afe48a8adc3be98a5fdfaa5', function=Function(arguments='{"location": "Beijing, China"}', name='get_current_temperature'), type='function')], reasoning_content=None), stop_reason=None, token_ids=None)], created=1756806588, model='LongCat-Flash-Chat', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=25, prompt_tokens=487, total_tokens=512, completion_tokens_details=None, prompt_tokens_details=None), prompt_logprobs=None, prompt_token_ids=None, kv_transfer_params=None)
get_current_temperature {"location": "Beijing, China"}

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
@mergify mergify bot added the `documentation`, `frontend`, and `tool-calling` labels Sep 2, 2025
@Xu-Wenqing Xu-Wenqing changed the title [WIP] Support LongCat-Flash-Chat tool call Support LongCat-Flash-Chat tool call Sep 2, 2025
@Xu-Wenqing Xu-Wenqing marked this pull request as ready for review September 2, 2025 09:50
@Xu-Wenqing (Contributor, Author)

Waiting for vLLM LongCat-Flash PR merge: #23991

@ghost commented Sep 3, 2025

Hope it can be merged soon.

@chaunceyjiang chaunceyjiang self-assigned this Sep 22, 2025
@Xu-Wenqing (Contributor, Author) commented Sep 25, 2025

@chaunceyjiang @aarnphm @hmellor the related PR: #23991 has been merged, could you please take a review again? Thanks.

@chaunceyjiang (Collaborator) left a comment

Thanks~

@chaunceyjiang chaunceyjiang added the `ready` label (ONLY add when PR is ready to merge/full CI is needed) Sep 26, 2025
@chaunceyjiang chaunceyjiang enabled auto-merge (squash) September 26, 2025 08:00
@chaunceyjiang chaunceyjiang merged commit b03b1b9 into vllm-project:main Sep 26, 2025
54 checks passed
pdasigi pushed a commit to pdasigi/vllm that referenced this pull request Oct 2, 2025
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
choprahetarth pushed a commit to Tandemn-Labs/vllm that referenced this pull request Oct 11, 2025
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025
Signed-off-by: 许文卿 <xwq391974@alibaba-inc.com>